On Augmenting Trace Cache for High-Bandwidth Value Prediction
Abstract
Value prediction is a technique that breaks true data dependences by predicting the outcome of an instruction and speculatively executing its data-dependent instructions based on the predicted outcome. As the instruction fetch rate and issue rate of processors increase, the potential data dependences among instructions issued in the same cycle also increase, so value prediction and speculative execution become critical to keeping the issue rate high. Unfortunately, most proposed value prediction schemes have focused only on prediction accuracy and have yet to consider the bandwidth required to access the value prediction tables. In this paper, we focus on the bandwidth issues of value prediction. We propose augmenting the trace cache [19], [26] (which was proposed to provide the required fetch bandwidth for wide-issue ILP processors) with a copy of the predicted values and moving the generation of those predicted values (which requires accessing the value prediction tables) from the instruction fetch stage to a later stage, e.g., the writeback stage. Such a change allows “selective value prediction,” i.e., only those instructions that require value prediction access the value prediction tables, which can significantly reduce the bandwidth requirement of those tables. We also use a dynamic classification scheme to steer predictor updates to behavior-specific tables (such as last-value, stride, and two-level). A relatively even split among such table accesses further moderates the bandwidth requirement of those tables.
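The idea in the abstract can be illustrated with a toy software model. The following is only a minimal sketch under stated assumptions, not the paper's design: the names `TraceSlot`, `needs_prediction`, `writeback`, and `run` are hypothetical, the classifier (steer to the stride table when the result changed since the last writeback, otherwise to the last-value table) is a stand-in for the paper's dynamic classification scheme, and the two-level predictor is omitted. It shows predicted values being read from the trace line at fetch, with the prediction tables touched only at writeback and only for instructions marked as needing prediction.

```python
from dataclasses import dataclass
from typing import Optional


class LastValueTable:
    """Predicts that the next result equals the most recent one."""

    def __init__(self):
        self.entries = {}   # pc -> last value
        self.accesses = 0

    def update_and_predict(self, pc, actual):
        self.accesses += 1
        self.entries[pc] = actual
        return actual


class StrideTable:
    """Predicts the most recent result plus the last observed stride."""

    def __init__(self):
        self.entries = {}   # pc -> (last value, stride)
        self.accesses = 0

    def update_and_predict(self, pc, actual):
        self.accesses += 1
        last, _ = self.entries.get(pc, (actual, 0))
        stride = actual - last
        self.entries[pc] = (actual, stride)
        return actual + stride


@dataclass
class TraceSlot:
    """One instruction in a trace line, augmented with a predicted value."""
    pc: int
    needs_prediction: bool           # hypothetical flag set by some selection policy
    predicted: Optional[int] = None  # copy of the prediction stored in the trace line
    last: Optional[int] = None       # last result, used by the toy classifier


def writeback(slot, actual, tables):
    """Refresh the stored prediction at writeback, touching one table at most."""
    if not slot.needs_prediction:
        return  # selective value prediction: no table access at all
    # Toy classifier (an assumption): a changed result suggests stride behavior.
    changed = slot.last is not None and actual != slot.last
    kind = "stride" if changed else "last"
    slot.last = actual
    slot.predicted = tables[kind].update_and_predict(slot.pc, actual)


def run(trace_line, outcomes, tables):
    """Replay the trace line once per outcome tuple; count correct predictions."""
    correct = 0
    for results in outcomes:
        for slot, actual in zip(trace_line, results):
            # Fetch stage: the prediction comes from the trace line, not a table.
            if slot.needs_prediction and slot.predicted == actual:
                correct += 1
            writeback(slot, actual, tables)
    return correct


tables = {"last": LastValueTable(), "stride": StrideTable()}
line = [TraceSlot(0x10, True), TraceSlot(0x14, False), TraceSlot(0x18, True)]
outcomes = [(0, 5, 7), (4, 9, 7), (8, 2, 7), (12, 6, 7)]
correct = run(line, outcomes, tables)
total_accesses = tables["last"].accesses + tables["stride"].accesses
print(correct, total_accesses)
```

In this toy run, the three-instruction trace line executes four times, yet the tables are accessed 8 times rather than 12, because the instruction at `0x14` never needs a prediction; those accesses are also split between the two tables (5 to last-value, 3 to stride) rather than landing on a single structure.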
Related articles
A Trace Cache Microarchitecture and Evaluation
As the instruction issue width of superscalar processors increases, instruction fetch bandwidth requirements will also increase. It will eventually become necessary to fetch multiple basic blocks per clock cycle. Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. Trace caches overcome this limitation by caching tra...
Performance Limits of Trace Caches
A growing number of studies have explored the use of trace caches as a mechanism to increase instruction fetch bandwidth. The trace cache is a memory structure that stores statically non-contiguous but dynamically adjacent instructions in contiguous memory locations. When coupled with an aggressive trace or multiple branch predictor, it can fetch multiple basic blocks per cycle using a single-p...
Design of Trace Caches for High Bandwidth Instruction Fetching
In modern high performance microprocessors, there has been a trend toward increased superscalarity and deeper speculation to extract instruction level parallelism. As issue rates rise, more aggressive instruction fetch mechanisms are needed to be able to fetch multiple basic blocks in a given cycle. One such fetch mechanism that shows a great deal of promise is the trace cache, originally propo...
Memshare: a Dynamic Multi-tenant Memory Key-value Cache
Web application performance is heavily reliant on the hit rate of memory-based caches. Current DRAM-based web caches statically partition their memory across multiple applications sharing the cache. This causes under utilization of memory which negatively impacts cache hit rates. We present Memshare, a novel web memory cache that dynamically manages memory across applications. Memshare provides...
Memshare: a Dynamic Multi-tenant Key-value Cache
Web application performance heavily relies on the hit rate of DRAM key-value caches. Current DRAM caches statically partition memory across applications that share the cache. This results in under utilization and limits cache hit rates. We present Memshare, a DRAM key-value cache that dynamically manages memory across applications. Memshare provides a resource sharing model that guarantees rese...
Journal:
- IEEE Trans. Computers
Volume 51
Publication date: 2002